1. INTRODUCTION

The industry has recently seen the advent of artificial intelligence, a field that makes machines 
smart enough to carry out certain tasks by learning on their own. One of its domains is Machine Learning 
(ML). Experts describe it as the “study that gives computers the [power] to learn without... 
explicit programming” [1]. ML is particularly powerful in the presence of massive amounts of data, often 
called “big data.” Increasingly large datasets enable increasingly accurate learning during the training of ML; 
these datasets can no longer be grasped by the human eye but can be scanned by computers running ML 
algorithms. With the advent of big data, both the amount of data available and our ability to process it have 
increased exponentially. The ability of machines to learn and thus appear ever more intelligent has increased 
proportionally. Machine learning is particularly suited to problems where: applicable associations or rules 
might be intuited but are not easily codified or described by simple logical rules; potential outputs or actions 
are defined, but which action to take depends on diverse conditions that cannot be predicted or uniquely 
identified before an event happens; accuracy is more important than interpretability; or the data 
are problematic for traditional analytic techniques, specifically wide data and highly correlated data. 

In this paper, we categorize how Machine Learning promises to assist or apply to problems in 
Engineering, Life Sciences, Finance, and Sports Analytics applications. It is improving technologies such as 
driver identification, which has far-reaching implications for insurance and health safety [2]. 
Some neuroscience experts believe machine learning “should be in the toolbox of most systems 
neuroscientists” [3]. It is protecting us from “catastrophic consequences such as [market] blackouts” due to 
cyber-attacks [4]. It is also helping players improve in times where “top-class coaches are hard to find” [5]. 
Although ML has emerged as a widespread concept, its techniques and implementations remain known to a 
few. In this paper, we provide a review of various kinds of ML algorithms simplified for beginners. 
Additionally, we aim to demonstrate the versatility of these algorithms through a detailed explanation of 
specific examples to encourage all types of industrial practitioners and researchers to adopt these ideas into 
their practices and gain better efficiency and accuracy. The rest of the paper is organized as follows: Section 
2 provides an overview of learning methods in machine learning. Section 3 discusses various applications of 
machine learning techniques in various fields like Health Sciences, Business, Autonomous Vehicles, Fraud 
Detection, Sports Analytics. Section 4 provides some comments on the challenges and future directions of 
the machine learning applications, and the final section concludes the paper. 


2. LEARNING ALGORITHMS 

In ML, there are three main types of learning: supervised, unsupervised, and reinforcement 
learning. Supervised learning uses prior data as a reference to generate models for predictions. In contrast, 
unsupervised learning interprets the given data and extracts representative patterns from it, 
e.g. clustering, dimensionality reduction, and compression. Both techniques nevertheless rely on similar 
strategies and ideas. The program must be flexible and robust enough to handle 
different datasets and adapt accordingly for accuracy. Reinforcement learning allows a system to learn the 
best actions based on the reward that occurs at the end of a sequence of actions. Many ML algorithms work 
within the framework of a cost function and a gradient. The cost function is often a mean squared error 
function of the distances between the predictions and the expected answers. The gradient measures the slope 
of the tangent to the cost function and indicates how the parameters should be adjusted for better performance [1]. 


2.1. Supervised learning 

Every supervised ML situation has a dataset X, parameters θ, a hypothesis $h_\theta(X) = \theta^T X$, and an 
output variable y. With every iteration in the program, the machine adjusts the parameters θ to bring $h_\theta(X)$ as 
close as possible to y. θ can also be obtained directly using the normal equation: $\theta = (X^T X)^{-1} X^T y$. This y serves 
as an answer key, used by the algorithm to improve. To make sure θ doesn’t become overly complex and 
create models that try too hard to fit the data, we regularize θ using λ. Large values of θ are penalized by 
adding a term based on the values of θ to the cost function. As a result, θ takes on appropriate values. When the hypothesis 
predicts a continuous real value based on given data, it is called regression. On the other hand, if the hypothesis tries 
to label the data appropriately, it is called classification. 
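
The normal equation mentioned in this subsection can be checked numerically. The sketch below is a minimal illustration, assuming NumPy is available; the function name `normal_equation` and the toy data (generated from y = 1 + 2x) are hypothetical and not taken from the source.

```python
import numpy as np

def normal_equation(X, y):
    """Return theta satisfying (X^T X) theta = X^T y.

    Solving the linear system avoids forming the matrix inverse explicitly.
    """
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical data: a column of ones (intercept) plus one feature, with y = 1 + 2x.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

theta = normal_equation(X, y)
print(theta)  # intercept and slope recovered from the data
```

Solving the system instead of inverting $X^T X$ is a numerical design choice; it assumes $X^T X$ is non-singular, which holds here because the two columns are linearly independent.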


2.1.1. Linear regression 

Linear regression is a popular technique for regression. It computes the cost function 
in (1) and the gradient in (2) on every iteration to approach the global minimum, the point where the cost function 
has the lowest value. This minimum is approached by adjusting θ using the cost function and gradient as in 
(3). Once it is reached, the program has found the parameters θ that best fit the training data under this cost, 
giving the most accurate model in that sense [1], as shown in Figure 1. 


[Figure 1 image: training data with a fitted linear regression line; x-axis: Age in years; y-axis: Height in meters]


Figure 1. Linear regression models 


IJ-AI Vol. 8, No. 4, December 2019: 411 — 421 


IJ-AI ISSN: 2252-8938 Oo 413 


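$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(X^{(i)}) - y^{(i)}\right)^2 + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2 \quad (1)$$

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(X^{(i)}) - y^{(i)}\right) X_j^{(i)} + \frac{\lambda}{m} \theta_j \quad (2)$$

$$\theta_j := \theta_j - \alpha \frac{\partial J(\theta)}{\partial \theta_j} \quad (3)$$

Here m is the number of training examples, n is the number of features, and α is the learning rate.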
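
As a rough illustration of the iterative procedure described in this subsection, the sketch below evaluates a regularized squared-error cost and its gradient, then repeatedly steps the parameters against the gradient. It assumes NumPy; the learning rate `alpha`, the iteration count, the choice to leave the intercept unregularized, and the toy data are assumptions made for illustration, not values from the source.

```python
import numpy as np

def cost(theta, X, y, lam):
    """Regularized squared-error cost; theta[0] (intercept) is not penalized."""
    m = len(y)
    err = X @ theta - y
    return (err @ err) / (2 * m) + lam / (2 * m) * np.sum(theta[1:] ** 2)

def gradient(theta, X, y, lam):
    """Gradient of the cost above with respect to theta."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m
    grad[1:] += lam / m * theta[1:]
    return grad

def gradient_descent(X, y, lam=0.0, alpha=0.1, iters=2000):
    """Start from zeros and take `iters` fixed-size steps against the gradient."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        theta = theta - alpha * gradient(theta, X, y, lam)
    return theta

# Hypothetical data: intercept column plus one feature, with y = 1 + 2x.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

theta = gradient_descent(X, y)
print(theta, cost(theta, X, y, 0.0))
```

With `lam=0.0` the cost is convex and the fixed step size used here is small enough for this data, so the iterates approach the same parameters that the closed-form solution would give. A larger `alpha` or unscaled features could make the updates diverge.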


2.1.2. Classification 

Classification uses a slightly modified regression algorithm. Logistic regression works in 
the same way as linear regression, but it has subtle differences in the specifics [6], as shown in Figure 2. While linear regression 
predicts continuous values, logistic regression must classify data into different types. So, it uses a probability 
model based on the sigmoid function [7], as shown in Figure 3. 

Figure 2. Logistic regression example 


[Figure 3 image; plot title: Logistic Regression in Rare Events Data]

Figure 3. Sigmoid Function 


$h_\theta(X) = \frac{1}{1 + e^{-\theta^T X}}$, which ranges from 0 to 1. Each X has an associated y, which is a set of the 
probabilities that X is of each of the possible types. The type with the highest probability for that X is what X 
will likely be, as given in (4-6). 
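
The sigmoid-based probability model can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the parameter values in `theta`, the 0.5 decision threshold, and the helper names are hypothetical and chosen only to show the mechanics for a two-class problem.

```python
import math

def sigmoid(z):
    """Map any real z into (0, 1); the two branches avoid overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(theta, x):
    """Probability that x belongs to the positive class: sigmoid(theta^T x)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

def predict_label(theta, x, threshold=0.5):
    """Label x as 1 when the predicted probability reaches the threshold."""
    return 1 if predict_proba(theta, x) >= threshold else 0

# Hypothetical parameters: intercept -4 and weight 1 on a single feature.
theta = [-4.0, 1.0]
p_low = predict_proba(theta, [1.0, 2.0])   # theta^T x = -2
p_high = predict_proba(theta, [1.0, 6.0])  # theta^T x = +2
print(p_low, p_high, predict_label(theta, [1.0, 6.0]))
```

Because the sigmoid is symmetric about z = 0, the two probabilities above sum to 1, and the decision boundary for a 0.5 threshold sits where $\theta^T X = 0$.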

The most commonly used supervised classification algorithms include neural networks, support 
vector machines, and decision trees. Both linear and logistic regression can be represented by neural networks, 
whose properties allow modeling more complex functions. 


— Artificial neural networks (ANN) 

ANNs work like neurons in our body. Each neuron can take in various inputs, from 
the starting data or from other neurons, and produce one output [8], as shown in Figure 4. In our situation, θ represents the connections 
between the neurons, and the neurons themselves contain data in some form. The output layer is used for 
comparison with y. The cost function and gradient are computed based on the type of regression chosen. The 
processing of neural networks is intricate, though. The algorithm must pass forward through the network once to get 
predictions and use them to calculate $J(\theta)$. The algorithm must also pass backward through the network to compute 
the gradient.

Figure 4. Neural network architecture 
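
The forward and backward passes can be illustrated on a very small network. The sketch below assumes NumPy, a single hidden layer, sigmoid activations, a squared-error cost, and no bias terms; none of these architectural choices come from the source, and the random data and weight shapes are purely hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(theta1, theta2, X):
    """Forward pass: inputs -> hidden activations -> output predictions."""
    a1 = sigmoid(X @ theta1)   # hidden layer, shape (m, hidden)
    a2 = sigmoid(a1 @ theta2)  # output layer, shape (m, 1)
    return a1, a2

def cost(theta1, theta2, X, y):
    """Squared-error cost computed from the forward-pass predictions."""
    _, a2 = forward(theta1, theta2, X)
    return np.sum((a2 - y) ** 2) / (2 * len(X))

def backward(theta1, theta2, X, y):
    """Backward pass: propagate the output error back to get both gradients."""
    a1, a2 = forward(theta1, theta2, X)
    delta2 = (a2 - y) * a2 * (1 - a2) / len(X)    # dJ/dz at the output layer
    grad2 = a1.T @ delta2                         # gradient for theta2
    delta1 = (delta2 @ theta2.T) * a1 * (1 - a1)  # dJ/dz at the hidden layer
    grad1 = X.T @ delta1                          # gradient for theta1
    return grad1, grad2

# Hypothetical data: 5 examples, 3 features, 4 hidden units, 1 output.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = rng.integers(0, 2, size=(5, 1)).astype(float)
theta1 = rng.normal(scale=0.5, size=(3, 4))
theta2 = rng.normal(scale=0.5, size=(4, 1))

grad1, grad2 = backward(theta1, theta2, X, y)
print(cost(theta1, theta2, X, y), grad1.shape, grad2.shape)
```

A common sanity check for such code is to compare one entry of the analytic gradient against a finite-difference estimate of the cost; the two should agree to several decimal places.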


— Support vector machines (SVM) 

Logistic regression rests on fine margins, so Support Vector Machines can be used instead to 
simplify the task and classify with more confidence. A Support Vector Machine differs considerably in its details but 
follows the same general approach as a logistic regression classifier. Instead of regularizing the parameters, the 
algorithm regularizes the theoretical cost with the variable C, as in (7). In other words, the algorithm develops 
robust decision boundaries for classification by forcing the theoretical cost to be low while allowing the 
parameters θ to adjust [9], as shown in Figure 5. The cost function is modified to make the cost 0 beyond a 
margin (usually 1) for positive and negative cases of $z = \theta^T X$ [10], as given in Figure 6. In logistic 
regression, $h_\theta(X)$ was 1 if $z \geq 0$ and 0 otherwise, but in SVMs, $h_\theta(X)$ is 1 if $z \geq 1$ and 0 if $z \leq -1$. 
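
The idea of a cost that drops to zero beyond a margin of 1 can be sketched as follows. This is a hedged illustration using the common piecewise-linear (hinge) form, which is an assumption here since the source's equation is not reproduced; the names `cost1`, `cost0`, and `svm_cost`, the unregularized intercept, and the toy data are hypothetical.

```python
def cost1(z):
    """Cost for a positive example (y = 1): zero once z >= 1."""
    return max(0.0, 1.0 - z)

def cost0(z):
    """Cost for a negative example (y = 0): zero once z <= -1."""
    return max(0.0, 1.0 + z)

def svm_cost(theta, X, y, C):
    """C times the summed margin costs plus half the squared weights.

    theta[0] is treated as the intercept and left out of the penalty.
    """
    data_term = 0.0
    for x_row, label in zip(X, y):
        z = sum(t * xi for t, xi in zip(theta, x_row))
        data_term += label * cost1(z) + (1 - label) * cost0(z)
    penalty = 0.5 * sum(t * t for t in theta[1:])
    return C * data_term + penalty

# Hypothetical data: both points lie beyond the margin, so only the penalty remains.
theta = [0.0, 1.0]
X = [[1.0, 2.0], [1.0, -2.0]]
y = [1, 0]
print(svm_cost(theta, X, y, C=1.0))
```

A large C makes margin violations expensive relative to the weight penalty, while a small C tolerates more violations in exchange for smaller weights.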


2.2. Unsupervised learning 

Every unsupervised ML problem builds a model around the data given and tries to use it for a 
specific purpose [11]. The machine studies data to identify different patterns, as shown in Figure 7. The 
machine determines correlations and relations by parsing the available data. Unsupervised learning is similar 
to human behavior: drawing inferences through observation and experience. An unsupervised algorithm, 
however, has some disadvantages, as it provides little information on the relative advantage and absolute 
reliability of the operation. Unsupervised algorithms include K-means clustering, Principal Component 
Analysis, Anomaly Detection, and Recommender Systems, among others. 

Figure 7. Unsupervised learning 


2.2.1. Clustering: K-means 

The purpose here is to separate the data into clear groups called clusters. K-means, an iterative 
algorithm, is preferred for clustering. First, randomly initialize K centroids, which will be used to find K 
groups in the dataset [12]. The algorithm works by assigning each data point to its closest centroid, 
as shown in Figure 8. Then, each centroid is shifted to the mean of all the data points closest to it. In each 
iteration, we reduce $J(\theta)$ given in (8). Over many iterations, the results stabilize, and the centroids settle at 
locations that define the final clusters. 

Figure 8. Clustering visualization 
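
The assign-then-shift loop of K-means can be sketched as below. This is a minimal illustration assuming NumPy and Euclidean distance; the optional `init` argument, the `seed`, the iteration cap, and the toy points are assumptions added for illustration and are not part of the source description.

```python
import numpy as np

def assign(X, centroids):
    """Index of the nearest centroid (Euclidean distance) for every point."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, init=None, iters=100, seed=0):
    """Alternate assignment and centroid updates until the centroids stop moving."""
    rng = np.random.default_rng(seed)
    if init is None:
        # Random initialization: pick k distinct data points as starting centroids.
        init = X[rng.choice(len(X), size=k, replace=False)]
    centroids = np.array(init, dtype=float)
    for _ in range(iters):
        labels = assign(X, centroids)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep an empty cluster's centroid where it is
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, assign(X, centroids)

# Hypothetical data: two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, labels = kmeans(X, k=2)
print(centroids, labels)
```

Because the starting centroids are random, different seeds can produce different cluster numberings, and on harder data different final clusters; running the algorithm several times and keeping the lowest-cost result is a common remedy.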


2.2.2. Principal component analysis 

This algorithm reduces the dimensions of data by creating a lower-dimensional projection that 
captures the essence of the data [13], as shown in Figure 9. It is imperative to do feature scaling and mean normalization 
before beginning, as in (9). Compute the mean μ and standard deviation σ of each feature, then apply the following to each element of 
each feature: 


$$x_j^{(i)} := \frac{x_j^{(i)} - \mu_j}{\sigma_j} \quad (9)$$


The reduced feature vector can be obtained by using the eigenvectors of the covariance 
matrix in (10). 


$$\Sigma = \frac{1}{m} \sum_{i=1}^{m} x^{(i)} \left(x^{(i)}\right)^T \quad (10)$$

The reduced feature vector obtained can be used to increase efficiency in other algorithms. 

Figure 9. Deriving PCA Axes 
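
The steps of this subsection (scale the features, form the covariance matrix, project onto its leading eigenvectors) can be sketched as follows. The example assumes NumPy and uses `np.linalg.eigh`, which is suited to symmetric matrices; the function name `pca`, the number of components `k`, and the toy data are hypothetical.

```python
import numpy as np

def pca(X, k):
    """Project X onto its k leading principal components.

    Returns the projected data, the component directions, and all eigenvalues
    sorted from largest to smallest.
    """
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)                 # assumes no feature is constant
    Xn = (X - mu) / sigma                 # mean normalization and feature scaling
    cov = Xn.T @ Xn / len(Xn)             # covariance matrix of the scaled data
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]     # eigh returns ascending order
    U = eigvecs[:, order[:k]]
    return Xn @ U, U, eigvals[order]

# Hypothetical data: the second feature is exactly twice the first,
# so a single component captures all of the variance.
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]])
Z, U, eigvals = pca(X, k=1)
print(Z.ravel(), eigvals)
```

The sign of each eigenvector is arbitrary, so the projected values may come out negated on a different platform; the reconstruction `Z @ U.T` is unaffected by that choice.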


2.2.3. Anomaly detection 

Anomaly detection is a popular approach to detecting outliers or uncharacteristic data, as shown in 
Figure 10. Anomaly detection hinges on a probability model, like logistic regression, and flags data points as anomalies if 
their probability is below the threshold ε in (11). Probabilities are commonly found by a Gaussian 
distribution that outputs the probability that a data point is typical of the dataset when provided with the mean μ 
and variance σ² [14]. Instead of using $J(\theta)$ to evaluate performance, we can use other metrics like precision, 
recall, or F1-score. 


$$p(X) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(X - \mu)^2}{2\sigma^2}\right) \quad \text{for a single feature}$$




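$$p(X) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\left(-\frac{(X_j - \mu_j)^2}{2\sigma_j^2}\right) \quad \text{for multiple features} \quad (11)$$


If $p(X) \geq \varepsilon$, X is OK 
If $p(X) < \varepsilon$, X is an anomaly 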


[Figure 10 image; plot label: Anomalies]


Figure 10. Finding Anomalies 
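
The Gaussian thresholding rule can be sketched with the standard library alone. The example below fits a mean and variance per feature, multiplies the per-feature densities, and flags a point when the product falls below a threshold. The helper names, the threshold value `epsilon = 0.01`, and the toy measurements are hypothetical; in practice the threshold would be tuned on labeled validation data using metrics such as precision, recall, or F1-score.

```python
import math

def fit_gaussian(rows):
    """Per-feature mean and variance of the training rows."""
    m, n = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / m for j in range(n)]
    var = [sum((r[j] - mu[j]) ** 2 for r in rows) / m for j in range(n)]
    return mu, var

def density(x, mu, var):
    """Product of independent per-feature Gaussian densities at x."""
    p = 1.0
    for xj, mj, vj in zip(x, mu, var):
        p *= math.exp(-((xj - mj) ** 2) / (2 * vj)) / math.sqrt(2 * math.pi * vj)
    return p

def is_anomaly(x, mu, var, epsilon):
    """Flag x when its density falls below the threshold epsilon."""
    return density(x, mu, var) < epsilon

# Hypothetical training data: two features clustered around (1.0, 10.0).
train = [[1.0, 10.0], [1.2, 9.8], [0.8, 10.2], [1.1, 10.1], [0.9, 9.9]]
mu, var = fit_gaussian(train)
epsilon = 0.01  # hypothetical threshold

print(is_anomaly([1.0, 10.0], mu, var, epsilon))  # typical point
print(is_anomaly([3.0, 10.0], mu, var, epsilon))  # far from the training data
```

Treating the features as independent keeps the model simple; strongly correlated features would call for a full multivariate Gaussian instead.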


2.2.4. Recommender systems 

Recommender systems, however, are a little different. In this algorithm, the machine first 
learns user preferences as the parameters θ. Then, it uses them to fill in the data X, from which it can generate better 
θ using (12-14); it then updates the parameters again, and so on, until both reach high accuracy. The hypothesis is 
$h_\theta(X) = (\theta^{(j)})^T X^{(i)}$. To keep θ and X from fitting the data too closely and failing to generalize, we 
regularize both. For the cost function and gradient, we only consider the points that have data, since we may 
have an incomplete data set. 
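
One way to picture this is a small collaborative-filtering sketch in which item features X and user parameters θ are both learned from the observed ratings only. The code assumes NumPy; the squared-error cost with a shared regularization weight `lam`, the simultaneous (rather than strictly alternating) gradient updates, the step size, and the tiny ratings matrix are assumptions for illustration and are not taken from the source's equations.

```python
import numpy as np

def cofi_cost(X, Theta, Y, R, lam):
    """Squared error over observed ratings (R == 1) plus penalties on X and Theta."""
    err = (X @ Theta.T - Y) * R
    return 0.5 * np.sum(err ** 2) + lam / 2 * (np.sum(X ** 2) + np.sum(Theta ** 2))

def cofi_step(X, Theta, Y, R, lam, alpha):
    """One simultaneous gradient step on item features X and user parameters Theta."""
    err = (X @ Theta.T - Y) * R
    X_grad = err @ Theta + lam * X
    Theta_grad = err.T @ X + lam * Theta
    return X - alpha * X_grad, Theta - alpha * Theta_grad

# Hypothetical ratings: 4 items (rows) by 3 users (columns); R marks observed entries.
Y = np.array([[5.0, 4.0, 0.0],
              [4.0, 0.0, 1.0],
              [1.0, 1.0, 5.0],
              [0.0, 2.0, 4.0]])
R = np.array([[1, 1, 0],
              [1, 0, 1],
              [1, 1, 1],
              [0, 1, 1]])

rng = np.random.default_rng(0)
X0 = rng.normal(scale=0.1, size=(4, 2))      # 2 latent features per item
Theta0 = rng.normal(scale=0.1, size=(3, 2))  # 2 parameters per user

X, Theta = X0, Theta0
for _ in range(500):
    X, Theta = cofi_step(X, Theta, Y, R, lam=0.1, alpha=0.01)

print(cofi_cost(X0, Theta0, Y, R, 0.1), cofi_cost(X, Theta, Y, R, 0.1))
print(np.round(X @ Theta.T, 2))  # predictions, including the unobserved entries
```

Multiplying the error by the mask R is what restricts the cost and gradients to the entries that actually have data; the unobserved entries are then filled in by the learned product.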


2.3. Reinforcement learning 

The concept of reinforcement learning has been taken from operant conditioning and behavioral 
psychology. Reinforcement learning is a part of ML concerned with maximizing reward in a given 
circumstance, as shown in Figure 11. In this particular area of ML, the machine is given a certain set of rules 
and regulations. By using these rules, the machine will look at a variety of actions to get the best 
result [15]. The process of obtaining the desired result usually takes a lot of time, 
since it takes time for the machine to understand which action would lead to the desired outcome. 
Reinforcement learning is growing rapidly and is being adopted by many software developers, including Oracle. 

Figure 11. Reinforcement learning process 
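
To make the reward-driven loop concrete, the sketch below uses tabular Q-learning, one common reinforcement learning algorithm that the source does not name, on a hypothetical five-state corridor. The environment, the reward of 1 at the goal, the purely random exploration policy, and the values of `alpha` and `gamma` are all assumptions for illustration. It uses only the Python standard library.

```python
import random

N_STATES = 5         # states 0..4; state 4 is the goal and ends the episode
ACTIONS = (-1, +1)   # action 0 moves left, action 1 moves right

def step(state, action_index):
    """Apply an action; walls clamp the position, and reaching the goal pays 1."""
    nxt = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from randomly explored episodes."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.randrange(2)  # explore uniformly at random
            nxt, reward, done = step(state, action)
            # Target: immediate reward plus discounted best value of the next state.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q

q = q_learning()
# Greedy policy: in each non-terminal state, pick the action with the higher value.
policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Although the only reward arrives at the end of each episode, the update rule passes its discounted value back one state at a time, which is how the earlier states come to prefer moving toward the goal. Random exploration is used here for simplicity; practical systems usually balance exploration against exploiting what has already been learned.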


3. APPLICATIONS 

Many potential ML applications become possible once the data, algorithms, and infrastructure are 
all in place. Some of these applications, and the conditions that require attention when applying these 
algorithms, are discussed below. 


3.1. Life sciences (supervised learning) 

Supervised ML has wide implications in the field of neuroscience due to its similarities with our 
learning methods and the ability to understand complex data [16]. Firstly, it can be used to improve the 
predictive performance of measurements generated from various methods. ML can be used to infer 
intention from information produced or to predict future neural activity from past activity using techniques such as 
linear regression. Secondly, linear regression can also be used to find a correlation between measured 
variables, and logistic regression or SVM can determine how indicative a variable is of the desired variable. 
It can help identify which outside influences trigger which parts of the brain and vice versa. Thirdly, 
supervised ML models can also serve as a benchmark to compare simpler models against, instead of R², since they account 
for random noise. They can also indicate how the components can be improved. Lastly, since the brain was the 
inspiration for artificial neural networks, they can also be used to model their teacher, the brain. 
Unsupervised learning techniques can assist the supervised learning methods by reducing dimensions, finding 
clusters, or extracting features. 

Classification algorithms like logistic regression or SVMs can be convenient in physiotherapy as 
well. They can be used in medical imaging to separate various “X-ray or MRI images” [17]. Patients can be 
classified into various categories based on the level of pain they feel, as inferred from brain MRI data. 
Classification can also be applied in the risk prediction of injuries. Decision support systems can be built with 
a combination of recommender systems that extract features and then perform classification. Unsupervised ML is 
less prevalent in the field. In most cases, it needs to be coupled with supervised ML, and its accuracy by itself 
is not known. Lastly, cancer and other fatal diseases can be detected early in their onset by using ML. 
In particular, using principal component analysis to reduce features and then implementing SVM-based 
algorithms for classification has produced very accurate results (0.96/1) in the case of breast cancer [18]. 
Implementing ML can make cancer detection faster and more efficient. 


3.2. Sports (classification) 

Creating an artificial training expert system to maximize the potential of basketball athletes is an 
intriguing prospect. ML can be used to monitor fitness and wellness, amongst many other aspects of the 
athlete. The use of low-power wearable devices equipped with powerful embedded processors has been 
suggested as a possible training model. These wearable devices can be used to get information about the 
physiology, biomechanics, psychology, and sociology of a trainee. This application makes use of supervised 
learning, specifically classification, to determine the optimal method for training basketball 
players [19]. After determining the different types of training methods, SVM classification algorithms can be 
used to classify the basketball training type. All athletes have potential and tendencies, which can be used as 
data to predict future performance. This prediction can be achieved by automated 
machine learning and predictive modeling. Creating a training method based on opponents is an effective way to 
optimize the performance of a player and get the best out of them. Machine Learning can also play a role in 
protecting the health of a player through wearable technology, providing the data needed to create a 
diet and fitness plan and improve performance. ML can also predict the maximum capability of a player and 
when pushing toward it could hinder performance. ML can also analyze off-field data to find what brings out the 
best in a player. ML can identify the priorities and goals of a player, and when they would stay the 
most motivated and give their best. Also, ML can drastically improve operational efficiency by predicting 
how an event might unfold and informing the right staffing and inventory decisions. 


3.3. Finance (unsupervised learning) 

ML has become a very common feature in the finance industry. It has been used for a wide range of 
applications, from bank and credit card fraud detection to process automation. In ML, problems like detecting financial 
fraud can be solved using classification, which involves coming up with a solution based on previous data. 
Examples of classification problems include Recommender Systems, Spam Detectors, and Loan Default 
Prediction. When it comes to credit card fraud detection, the classification problem involves creating models 
that can properly classify transactions as either legitimate or fraudulent, based on 
transaction details such as amount, merchant, location, time, and others. Financial fraud accounts for a 
staggering amount of money. Hackers around the world are always exploring new ways of committing financial 
fraud. Relying exclusively on traditional programmed systems for detecting financial fraud isn’t 
optimal. ML is a promising solution to this problem [20]. Modeling financial fraud as classification can very 
well be considered as an option. Investment in various fields of technology for financial fraud detection has increased 
significantly over the last decade or more. 

Process automation is another application of ML in the financial industry, which is becoming 
more popular by the day. The technology allows us to replace time-consuming work and increase efficiency and 
productivity. As a result, ML enables companies to optimize costs and improve some of the services they 
provide. Automation applications of ML in finance include: (1) virtual assistants, (2) 
automation in call centers, (3) automation in paperwork, and (4) simulation of employee training. Security 
threats in finance are growing rapidly, mainly due to the number of transactions, users, and third-party 
integrations. ML algorithms prove to be extremely helpful in detecting financial fraud. For instance, banks 
can use this technology to monitor thousands of transactions using various parameters. Such a model can be 
quite accurate in predicting fraudulent transactions. Financial monitoring is another case of ML used 
for financial security. Data scientists can train the system to detect a large number of micropayments and flag 
money laundering techniques such as smurfing, and they can use these models 
to identify and stop illegal transactions. ML algorithms can also play a significant role in enhancing 
network security. Data scientists train a system to identify cybersecurity threats, as ML is good at spotting 
some of these threats using thousands of parameters. This technology is likely to become a cornerstone of 
cybersecurity in the future. 


3.4. Autonomous vehicles (reinforcement learning) 

Driving involves only a certain set of actions, but these actions are dependent on various factors and 
situations, many of which cannot be predicted, including weather and road conditions. One of the biggest 
factors is the driving of other vehicles. A human can instinctively solve the problem based on 
the situation at hand, choosing from a certain set of available actions. 
Capturing all these various permutations and combinations in explicit rules is almost impossible. ML can be 
used to predict the fastest route to any destination and to estimate travel time based on 
road and traffic conditions. ML can drive a car without human input, mainly through reinforcement learning 
[21]. Reinforcement learning involves providing the machine with a certain set of rules and regulations and 
also the optimal result. By using the rules, the machine must explore various actions to come up with a 
solution that provides the optimal outcome. Reinforcement learning is similar to teaching a game to someone. 
One explains how the game is played, and the machine has to find a way to win the game. It must also 
change strategy based on other players and destinations in the game. 


4. FUTURE DIRECTIONS 

Machine learning algorithms have to be constantly updated and altered based on current data so that 
they stay appropriate to the present situation. With the generation of large volumes of data, current biological studies 
are in great need of highly efficient data analysis methods [22]. The increasing amount of data 
needs fast and accurate analysis algorithms for the best results. Beyond data collection and recommendations, 
robots trained with supervised learning may one day perform surgeries instead of just assisting doctors. They 
may also be able to make medicine based on previous knowledge about drug-making procedures. 

A tremendous opportunity is present in the Autonomous Vehicle (AV) market. The market is predicted to 
reach $7 trillion annually by 2050 [20]. Regulators and legislators, who are attempting to support the 
development of intelligent infrastructure while protecting consumers, are churning out a patchwork of rules 
and regulations that differ by country and region. The volume of data produced by an autonomous vehicle 
every second is huge, leading to challenges such as recording this data, accessing it, using it appropriately, sharing 
it with other vehicles and infrastructure, and collecting consent from the end-user. An 
autonomous vehicle responds to driving scenarios based on its previous experiences. New or unexpected 
roadway or weather conditions, therefore, can lead to accidents and system failures [23]. Improved 
infrastructure, such as radio transmitters instead of traffic signals and advanced network systems 
in vehicles and roads, could give improved data on various conditions. 

Another challenge for ML is integrating autonomous vehicles with human drivers. 
Different countries and regions are addressing AV challenges through a jumble of rules and regulations that 
make it more challenging for manufacturers to identify a clear path forward [24]. A collaborative 
environment allows technical professionals to share knowledge, discuss issues, and identify standards-related 
solutions to drive innovation and accelerate emerging technology adoption [25-26]. ML involves the use of a 
lot of data and can also prove to be time-consuming. So, there must be an evaluation of which models can be 
used and the maximum amount of time that can be spent [27-28]. Alternative solutions like deep learning may be 
applied to the problems stated above. Deep learning is a modern, advanced machine learning technique that makes use of 
extremely sophisticated neural networks. It is a pillar of many advanced machine learning systems 
today. Cognitive computing is another evolving field of AI and refers to computing that focuses on 
reasoning and understanding at a higher level. Cognitive computing finds application in areas that need to 
improve decisions, reduce costs, and optimize outcomes by leveraging natural language and evidence-based 
learning. 


5. CONCLUSIONS 

In this paper, machine learning is analyzed in the context of various industrial applications, as 
represented in recent academic literature. The provided summaries and detailed information serve as a useful tool for 
exploring relevant techniques in the areas concerned and demonstrate the research’s utility. For the four main 
categories of application identified (Life Sciences, Sports Analysis, Fraud Detection, and Autonomous 
Vehicles), interesting techniques, challenges, and opportunities have been identified. Possible courses of 
action have been presented, which can encourage discussion amongst researchers and help new researchers 
in the field comprehend advanced concepts for their own use. Finally, machine learning algorithms must be 
continuously refreshed and refined based on data that reflect current circumstances. 